Abstract

Demographic estimates of population at risk often underpin epidemiologic research and public health surveillance 
efforts. In spite of their central importance to epidemiology and public-health practice, little previous attention has 
been paid to evaluating the magnitude of errors associated with such estimates or the sensitivity of epidemiologic 
statistics to these effects. Although it is well known that accuracy in demographic estimates declines 
as the size of the population to be estimated decreases, demographers continue to face pressure to produce 
estimates for increasingly fine-grained population characteristics at ever-smaller geographic scales. Unfortunately, 
little guidance on the magnitude of errors that can be expected in such estimates is currently available in the 
literature for consideration in small-area epidemiology. This paper attempts to fill this gap by 
producing a Vintage 2010 set of single-year-of-age estimates for census tracts, then evaluating their accuracy and 
precision in light of the results of the 2010 Census. These estimates are produced and evaluated for 499 census 
tracts in New Mexico for single-years of age from 0 to 21 and for each sex individually. The error distributions 
associated with these estimates are characterized statistically using non-parametric statistics including the median 
and 2.5th and 97.5th percentiles. The impact of these errors is considered through simulations in which observed 
and estimated 2010 population counts are used as alternative denominators and simulated event counts are used 
to compute a realistic range of prevalence values. The implications of the results of this study for small-area 
epidemiologic research in cancer and environmental health are considered. 



Introduction 

In recent years, a growing demand for small-area 
demographic estimates has been observed. Much of this 
demand comes from epidemiologists, who utilize these 
estimates for small-area surveillance efforts in the areas 
of cancer and environmental epidemiology in particular 
[1-5]. The potential of small-area epidemiology has gener- 
ated considerable excitement [1-5]; however, it has also 
created important challenges for the demographers who 
produce small-area estimates of population at risk as well 
as the epidemiologists who use them. At a fundamental 
level, it is well known that as the size of the population to 
be estimated decreases, errors in demographic estimates 
increase [6-11]. These errors can be surprisingly large 
[6-11], but at present their impact on small-area epidemi- 
ologic measures has been incompletely described, and the 
implications of these errors for small-area health tracking
and analytic epidemiology have not received an adequate
amount of attention [12-17]. This paper attempts to fill
this gap by characterizing the errors associated with a set
of single-year-of-age estimates made at the level of United
States census tracts and analyzing the potential sensitivity 
of small-area crude prevalence measures to these errors. 

This example is extreme in both its spatial scale (census 
tracts represent very small areas, often a single neighbor- 
hood) [18] as well as in the fine-grained age intervals to be 
estimated. Errors in census tract-level estimates in
five-year age groupings reported in previous studies have
ranged from as low as 10% [19] to as high as 80%
or more [9]. It is known that single-year-of-age estimates
can be relatively more volatile than those constructed in 
five-year age intervals [11,20]. A number of methods exist 
for making single-year-of-age estimates. Assuming mono- 
tonicity within five-year age intervals [21,22] and the 
stability of demographic processes over these short time
intervals [18], demographers have historically made use of 
methods that break out five-year interval estimates into 
single years of age through pro-rating, osculatory 
interpolation, or the closely related procedure known as 
"spline-fitting" [11,20-30]. Pro-rating involves the allo- 
cation of the five-year data based on either historical or 
assumed proportions; for example, one might divide 
five-year estimates into single years based on the known 
distribution of the last census or based on an assump- 
tion of rectangularity (equal proportions of one-fifth) 
[11]. Osculatory interpolation, in contrast, relies upon a 
theory in mathematics that revolves around the unique 
solution of simultaneous equations using linear systems 
designed to minimize discrepancies between observed 
five-year data and the re-aggregation of single-year-of-age
estimates into corresponding intervals [11,20,22-30].
Spline-fitting, similar to osculatory interpolation, involves
the overlapping of multiple polynomials to arrive at
estimates of distributions through an optimization
component based on the least-squares criterion [31]. The first two
procedures have been the most widely applied within ap- 
plied demography; a rather long historical discussion of 
spline-fitting has not resulted in its general implementa- 
tion by demographers working in non-academic settings 
(such as state government) where functionally utilized 
population estimates are typically made. 

The purpose of this paper is not to contrast the accuracy 
of these methods; rather, we seek to implement commonly 
utilized methods to characterize the magnitude of errors 
associated with a typical set of estimates of population at 
risk likely to be utilized by small-area epidemiologists in 
practice. The focus, therefore, will be upon describing the 
range of errors that one might expect to see in such a set 
and analyzing how these errors might impact a set of 
crude-prevalence estimates made at a correspondingly 
fine-grained spatial scale (census tracts). To accomplish 
this purpose, data from the 2010 US Census (Summary File 1)
were extracted for all census tracts (n = 499) within
the state of New Mexico from the American
FactFinder website [32]. The data extracted include a
gold-standard set of single-year-of-age counts and the 
corresponding five-year grouped data for each census 
tract. Our evaluation is straightforward: we compare 
single-year-of-age estimates made using methods of pro- 
rating and osculatory interpolation of five-year grouped 
data to observed single-year-of-age 2010 Census counts 
and characterize the moments of the resulting ex-post 
facto error distributions using established methods within 
demography [6,8,10]. Next, we simulate a range of plaus- 
ible event prevalences using published estimates of child- 
hood obesity rates and use them to analyze the effects of 
observed errors in demographic estimates on estimates 
of prevalence per 1,000 person-years. The results are
considered in light of practice in small-area epidemiologic
surveillance and suggestions for further research and 
evaluation are made. 

Materials and methods 

Input data and study area 

New Mexico represents a diverse study area where population
characteristics can vary dramatically at the tract level, in
concordance with larger geographic trends at the county
level. The state is characterized by highly urbanized and
rapidly growing metropolitan areas such as the cities of
Las Cruces, Rio Rancho, and Albuquerque; dynamic and
steadily growing small towns such as Roswell, Alamogordo,
Clovis, and Farmington (to mention just four); vast rural
areas; the presence of 22 tribal groups with long-standing
historical presence in the state; numerous Colonias [3];
and an overlapping mosaic of historical Land Grant
Communities linked to the Spanish Colonial Era and the
period of Mexican Independence prior to New Mexico
becoming a US territory in 1850 at the conclusion of the
Mexican-American War. In short, New Mexico represents a
microcosm of the demography of many communities
throughout the United States as well as important and
distinctive populations. Each of these dynamics is
represented at the census tract level, providing substantial
heterogeneity and material for analysis in the current
context. Counts of age/sex-specific population in five-year
intervals (0 to 4, 5 to 9, 10 to 14, 15 to 19, 20 to 24) and in
single years (0 to 21) were extracted from the SF1 file from
the 2010 Census. Data were extracted at the census tract
level (n = 499) for the entire state of New Mexico. Data
were not considered for specific race/ethnicity groups;
only "all race" counts were used.

Pro-rating and interpolation in demography 

In demography, the term "pro-rating" refers to the
allocation of grouped data into more fine-grained categories,
such as decomposing five-year age-grouped data into
single years as in the current analysis [11,20]. In this study,
pro-rating serves as a baseline activity: it is simpler than
the methods of polynomial interpolation described below,
but it also depends upon specific assumptions with little
appealing mathematical theory underlying them [23,24].
Here, rectangular pro-rating is utilized, in which the
assumption is made that single-year age groups within
any five-year age interval are equivalent: each single
year comprises one-fifth of the five-year age-grouped
data [11]. As pointed out by Brass [23] and others [11,20],
this method assumes that population processes (such as
birth, death, and migration functions) are similar from
year to year within the five-year age interval in question
[21,22], i.e., that the single-year data are monotonic in
relation to the five-year grouped data they produce [21,22].
This simplifying assumption is unlikely to be true, and
rectangular pro-rating is generally considered a strategy
to be implemented when no ancillary information on
population dynamics is available at an appropriate
geographic level [11,20].
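
The rectangular pro-rating step can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `rectangular_prorate` and the tract counts are hypothetical and do not come from the study data.

```python
def rectangular_prorate(five_year_counts, width=5):
    """Split each grouped count into `width` equal single-year shares."""
    single_years = []
    for count in five_year_counts:
        # Rectangular assumption: every single year gets the same share.
        single_years.extend([count / width] * width)
    return single_years


# Hypothetical tract counts for ages 0-4, 5-9, and 10-14 (illustrative only).
grouped = [250, 300, 275]
singles = rectangular_prorate(grouped)
print(singles[:5])   # shares for ages 0-4
print(sum(singles))  # re-aggregates to the grouped total
```

By construction the single-year shares re-aggregate exactly to the five-year totals, which is why any error in this method comes entirely from the equal-share assumption.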

The use of polynomial functions to describe relation- 
ships between time-ordered inputs and function-generated 
outputs has a long history within mathematics [33,34]. 
Their use in generating intermediate and unknown values 
within a dataset by interpolating between known values 
has an equally long history in applied fields such as clima- 
tology, economics, and demography [11,20-30]. Though 
polynomial interpolation approaches have been criticized 
in demography as being blind to population theory 
[20,23,24], in practice interpolation is easy to implement 
as many standardized formulas have been presented that 
involve only "plugging-in" of demographic data grouped 
in five-year intervals into predefined formulae to ar- 
rive at single-year-of-age estimates [11]. Figure 1 il- 
lustrates the relationship between single-year age 
structure and a polynomial function used to decompose 
five-year grouped data. 

As in Figure 1, an nth degree polynomial of the form: 

y = A + Bx + Cx^2 + Dx^3 + ... + Nx^n

may be fit to any curve for which some data points are 
known with certainty to arrive at estimates of intermedi- 
ate values. We may think of the interpolating polynomial 
as a system of equations, represented in terms of the 
well-known Vandermonde matrix (representing known 
values of demographic data), premultiplied against a
vector of coefficients A to N, to yield interpolated
values yi as in the linear equation. Once solved, the
function defined to estimate the yi is known as the
interpolant [18,19,21]. It is known that higher-degree
interpolating polynomials may often provide poorer fit of
intermediate points, suggesting that simpler polynomial
interpolants utilized by demographers may, in fact,
provide more accurate estimates of single years of age
[11,34,35].
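
For readers who prefer to see the system written out, the Vandermonde form referred to above can be sketched as follows. The notation is an assumption made for illustration: the known points are written (x_0, y_0), ..., (x_n, y_n), and the coefficients a_0, ..., a_n stand in for the constants A, B, ..., N of the polynomial.

```latex
\begin{pmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_n & x_n^2 & \cdots & x_n^n
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix}
=
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{pmatrix}
```

When the x_i are distinct, the matrix is invertible and the coefficient vector is unique, so the interpolant passes exactly through every known point.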

Exact solutions to such approximating polynomials are
difficult to implement using demographic data in five-year
age groups [18,19,21]; however, differencing formulas that
approximate them (those that minimize differences between
five-year grouped counts and estimated values thereof using
a polynomial function) are well known and highly accurate in
implementation [3]. An example is the Lagrange formula
(from reference [11], page 683):

f(x) = f(a)[(x-b)(x-c)(x-d)] / [(a-b)(a-c)(a-d)] +
       f(b)[(x-a)(x-c)(x-d)] / [(b-a)(b-c)(b-d)] + ...

which fits a polynomial of the form presented and passing 
through the two points a and b (which in this case are 
five-year grouped age counts) by minimizing differences 
between estimated values from the polynomial functions 
and these observed counts by shifting the values of the 
constants A, B, C, D, etc. [11,34]. In practice, the fitting of
points f(x) is accomplished by inputting values of f(a)
and f(b) into established formulas. This example of a
Lagrangian polynomial passing through two points may be
generalized to as many points as desired, and various
methods of interpolation in demography rely upon
differing numbers of points to achieve the desired fit.

Figure 1 Polynomial approximation of the distribution of single years of age.

Osculatory interpolation is similar to the method of 
spline-fitting, also utilized in demography [21,22]; here 
we choose to focus on several methods of osculatory 
interpolation as better representing methods that are 
more typically used in practice among applied demog- 
raphers. This choice does not reflect methodological 
preference, but better suits the purpose of this paper, 
which is to characterize the magnitude of errors that 
practicing epidemiologists and demographers might 
expect to see in small-area, fine-grained (with respect 
to age) estimates of population at risk and their impacts 
on measures of epidemiologic risk. 
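
As a hedged illustration of the general Lagrange form discussed above, the sketch below evaluates an interpolating polynomial through an arbitrary set of known points. It is a generic implementation, not the multiplier tables used by the Karup-King, Beers, or Sprague procedures; the function name and the test points (which lie on the curve y = x^3) are hypothetical.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis weight for point i: product over all other points j.
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


# Hypothetical known points on y = x**3; four points define a cubic exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [x ** 3 for x in xs]
value = lagrange_interpolate(xs, ys, 1.5)
print(value)
```

Each basis weight equals one at its own point and zero at every other known point, which is what forces the polynomial through the observed values.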

In this paper, we utilize several commonly imple- 
mented osculatory interpolation procedures including: 
the Karup-King [25,30], Beers 1 [11], Beers 2 [26], and 
Sprague methods [29]. These methods differ in the num- 
ber of points taken in the interpolation, with the Karup- 
King taking two differences, the Beers 1 and 2 focusing 
on four and six differences, and the Sprague method 
relying upon five. In general, previous studies in other 
fields [34,35] have suggested that the use of fewer points 
might enhance local accuracy in the interpolation 
[11,23-30], leading to a general hypothesis that the 
Karup-King may tend to out-perform alternatives. 

Statistical comparisons of error and model evaluation 
criteria 

Percentage discrepancies between the single-year-of-age 
estimates and corresponding 2010 Census counts form 
the basis of the evaluation reported in this paper, in 
accordance with the ex-post-facto evaluation method
typically utilized by demographers [6-11]. Because 
demographic error distributions are calculated across 
geographic levels with widely differing population 
sizes, the use of percentage error is often encouraged 
[8,36] and is therefore employed here. Demographic 
estimate error distributions are characterized by non- 
normality and a frequent lack of symmetry [8,35], making 
it difficult to make statements about the range of variation 
in estimation accuracy or to determine what is or is not an 
extreme error value [8,10]. In this study, all statistical error 
distributions were found to deviate from normality using 
the Kolmogorov-Smirnov test at the alpha = 0.05 level. 
A simple non-parametric solution is to make use of the
median as a summary measure of error and to utilize 
the percentile distribution between the 2.5th percentile 
and 97.5th percentile [37,38] to characterize precision; 
this is the strategy employed in this paper. These sum- 
mary measures are computed for each age/sex group, as 
well as across the entire range of ages within each sex. 
While this approach makes sense in light of the nature 
of the statistical error distributions employed in
demography, there is a lack of consensus in the literature
about what constitutes a "better" estimate among
available alternatives [8,10]. The perspective taken in 
this paper is to evaluate how much better one might do 
by employing a polynomial interpolation method than 
they would do by using a naive model based on simple 
rectangular pro-rating (assuming that one-fifth of the five- 
year age/sex count is within each single-year-of-age inter- 
val). This is the approach taken by Harper, Coleman, and 
Devine [39] as well as by Swanson and Tayman [40] in 
their "proportionate reduction in error" statistic. Because 
this paper relies upon summary statistics based on per- 
centages, models are evaluated in terms of: (1) the im- 
provement in percentage point error observed in each 
age/sex interval and (2) by the percentage point range 
between the 2.5th and 97.5th percentiles of the error 
distributions. The "best" fitting model, then, is determined 
to be the model that results in the greatest improvement 
in percentage accuracy over rectangular pro-rating and 
the lowest range of values between the 2.5th and 97.5th 
percentiles of the error distribution. 
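
The evaluation criteria above can be sketched as follows. Several details are assumptions made for illustration: the sign convention (estimate minus census, divided by census), the linear-interpolation percentile rule, and all function names. The improvement example uses the age-16 male medians for rectangular pro-rating (47.30) and Karup-King (-1.00), which correspond to the 46.30 percentage point reduction reported in the Results.

```python
import statistics


def percent_error(estimate, census):
    """Percentage error of an estimate relative to the census count."""
    if census == 0:
        raise ValueError("census count must be non-zero")
    return (estimate - census) / census * 100.0


def percentile(values, q):
    """Percentile by linear interpolation between order statistics (q in 0-100)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def summarize(errors):
    """Median plus the 2.5th-97.5th percentile range of an error distribution."""
    low, high = percentile(errors, 2.5), percentile(errors, 97.5)
    return {"median": statistics.median(errors), "p2.5": low,
            "p97.5": high, "range": high - low}


def improvement(rect_median, method_median):
    """Percentage point gain in accuracy over rectangular pro-rating."""
    return abs(rect_median) - abs(method_median)


summary = summarize([float(v) for v in range(101)])  # toy error values
gain = improvement(47.30, -1.00)
print(summary, gain)
```

Tracts with a census count of zero have an undefined percentage error; this sketch raises an exception for them rather than guessing how they were handled.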

Previous studies [9,19] of errors associated with 
demographic methods at the census tract level have in- 
dicated that over 10-year periods starting at the previ- 
ous census, a substantial amount of error may 
accumulate [8,41]. Errors in these studies have ranged
from as low as 10% to as high as 80% within any
age/sex five-year age grouping. For single-year-of-age 
estimates, it could be anticipated that errors could be 
larger than this, but isolating how much of this error 
would be due to the practices of pro-rating or polyno- 
mial interpolation would be difficult since errors in the 
five-year age/sex-grouped estimates would also affect 
the single-year-of-age estimates. To avoid this chal- 
lenge, in this study we utilize polynomial interpolation 
and pro-rating methods on known 2010 Census five- 
year counts. This practice isolates the error associated 
with the method by eliminating the conflation associ- 
ated with using uncertain five-year age/sex-specific esti- 
mates. The errors and error distributions reported in 
this study are due solely to those associated with the 
methods of pro-rating and polynomial interpolation 
that are the focus of the paper. 

The effect of errors in small-area demographic estimates 
on epidemiologic statistics 

Small-area epidemiology faces significant challenges in 
the geographic positioning of event data, through the 
process of geocoding [42-47], necessary for calculating 
epidemiologic statistics such as incidence, prevalence, etc. 
These issues should also be anticipated to be important in 
making inferences associated with analytic epidemiology 
[1-5,12-18,48-50], but they are beyond the scope of the 
current paper, which will examine only the effects of 
small-area demographic estimation error on surveillance
statistics. To assess the impacts of errors in demographic
estimates, this paper uses a simple simulation-based
approach to analyze the sensitivity of small-area crude
prevalence estimates within each single-year-of-age grouping.
The "best-performing" set of demographic estimates for 
each sex is utilized as a denominator in calculating risk 
measures. Event counts were simulated using childhood 
obesity (a common event whose prevalence has been 
estimated to be as high as 1/5 or 200/1000 persons) as an 
example. The distribution of prevalences was estimated
using a Monte Carlo simulation [51,52] in the R statistical
package that assumed: (1) normality and symmetry of
the prevalence distribution, (2) an average prevalence of
17.5%, and (3) a standard deviation of 2.5%. This
distribution was resampled 10,000 times, with a burn-in period of
500 iterations and thinning to include only every 100th
observation to avoid commonly known challenges related
to autocorrelation in random number generation
algorithms [51,52]. The resulting distribution of prevalence
was used to estimate the 2.5th and 97.5th percentiles 
for use in the simulation. These points were then used 
to simulate case events for each census tract/sex/age 
grouping. Median differences between crude prevalence 
estimates of risk per 1,000 person-years calculated using 
2010 Census counts and the demographic estimates of 
population at risk as alternatives were computed.
Variability in the errors associated with risk per 1,000
person-years was then assessed using the 2.5th and
97.5th percentiles in light of observed non-normality and 
asymmetry in the distributions of these differences. 
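
A minimal sketch of the simulation logic described above is given here in Python rather than R. It draws prevalences from a normal distribution with the stated mean and standard deviation and compares crude prevalence under the two denominators. The burn-in and thinning steps are omitted, and the function names, the seed, the sign convention of the difference, and the tract counts are all assumptions made for illustration.

```python
import random


def simulate_prevalence_bounds(mean=0.175, sd=0.025, draws=10_000, seed=2010):
    """Return the empirical 2.5th and 97.5th percentiles of simulated prevalence."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = sorted(rng.gauss(mean, sd) for _ in range(draws))
    low = sample[int(0.025 * (draws - 1))]
    high = sample[int(0.975 * (draws - 1))]
    return low, high


def prevalence_difference_per_1000(prevalence, census_count, estimated_count):
    """Difference in crude prevalence per 1,000 when the denominator is estimated."""
    cases = prevalence * census_count          # simulated event count
    observed = 1000.0 * cases / census_count   # census denominator
    estimated = 1000.0 * cases / estimated_count
    return estimated - observed


low, high = simulate_prevalence_bounds()
# Hypothetical tract/age cell: census count 100, estimate 110 (10% too high).
diff = prevalence_difference_per_1000(0.1257, 100, 110)
print(low, high, diff)
```

Because the case count is held fixed, an overestimated denominator always pulls the computed prevalence downward, and the size of the shift scales with the prevalence itself.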

Results 

Errors in single-year-of-age small area estimates of 
population at risk 

For males, the simplest interpolation method, the Karup-King
procedure, produced the smallest errors for the most
age groups. For nine out of 21 age intervals, this method
was found to be the most accurate available method
(Table 1). Use of this method would reduce error in com- 
parison to the rectangular pro-rating method by as much 
as 46.30 percentage points (age 16) or as little as only 0.18 
percentage points (age 6). On average, use of the Karup- 
King method would improve estimation accuracy by 7.82 
percentage points over the rectangular pro-rating method. 
It is worth noting that the performance of the Karup-King 
method is similar to that of either the Beers 1 or Beers 2 
methods, meaning that the sensitivity of epidemiologic 
statistics to a typical set of demographic estimates may 
be similar even when different methods are utilized,
especially when we consider that each of these methods 
differs widely in the number of data points used in the 
interpolation. In contrast, the Sprague method provided 
much less reduction in error on average when compared
to rectangular pro-rating (4.49 percentage points) and in a
number of cases (ages 0, 4, 6, 9, 11, 14, and 21) actually 
provided a less accurate estimate than that observed when 
using simple rectangular pro-rating. Some of the increases 
in error are substantial when using the Sprague method, 
suggesting that the increased degree of this polynomial 
may be associated with poorer fitting of intermediate 
values as noted in studies in other fields [34,35]. These re- 
sults appear to hold for males in terms of the precision of 
the methods. While all methods showed a wider range of
error percentile distributions than is desirable (the
difference between the 2.5th and 97.5th percentiles
frequently exceeded 100 percentage points), the range for
the Karup-King method was consistently the smallest
(median = 131.59 percentage points, lowest = 77.74,
highest = 164.44). In 18 out of 21
cases, the Karup-King estimates were more accurate 
than using simple rectangular pro-rating. 

Among females (Table 2), differences in estimation
accuracy across the available methods were much less
clear. All of the interpolation-based methods
outperformed rectangular pro-rating in most cases: 16/22
with Karup-King, Beers 1 and Beers 2, and 13/22 for the 
Sprague method. The average reduction in error across 
the ages was greatest for the Beers 2 procedure, which 
reduced errors by over 4% on average; however, the reduc- 
tions in error were within 1 to 1.5 percentage points 
across all of the alternatives. It is noteworthy, however, that
the specific ages in which each method performed best 
and the magnitude of reductions at each age across the 
methods varied. The only estimates that appeared to
significantly increase bias were those made with the Sprague
interpolants (as was observed in males), which increased
errors by 65 percentage points among 9-year-old females
and by 39 points among 14-year-olds. Overall, the range
of errors associated with each procedure was extremely
similar, though the Beers 2 procedure again outperformed
the others very marginally. For the Beers 2 procedure,
the difference between the 2.5th and 97.5th percentiles 
ranged from a low of 99.84 percentage points (13-year- 
olds) to a high of 239.22 percentage points (16-year-olds). 

A striking feature of the results is that demographic 
estimates of single-year-of-age population at risk at the 
Census tract level appear to be similar across the differ- 
ent methods utilized and to contain a surprising level of 
inaccuracy and a very large range of values across the 
set. We defined the "best" set as the alternative with the 
greatest reduction in error over simple rectangular pro- 
rating and the least observed spread between the 2.5th 
and 97.5th percentiles of the error distribution. The 
best-fitting set of estimates was utilized in analyzing the
sensitivity of small-area crude prevalence measures to 
errors in these estimates. The best-fitting set for males 
was the Karup-King (two differences), while the Beers 2 
(six differences) was utilized for females. 



Table 1 Summary characteristics of percentage errors in estimates of single-years of age based on pro-rating or interpolation (males)

(2.5th and 97.5th are percentiles of the error distribution)

      Rectangular                Karup-King                 Beers 1                    Beers 2                    Sprague
Age   Median    2.5th   97.5th   Median    2.5th   97.5th   Median    2.5th   97.5th   Median    2.5th   97.5th   Median    2.5th   97.5th
                                                                                                -37.66   120.46    26.75   -40.84   168.68
16     47.30   -40.61   162.86    -1.00   -26.93    65.02     8.42   -35.54   158.79     3.68   -36.85   136.91     3.78   -28.41    83.00
17     30.17   -34.66   164.14     3.26   -31.41    87.35     7.58   -35.68   164.13     8.71   -33.85   162.80    -1.56   -43.28    82.18
18     45.32   -36.48   114.29    -1.15   -29.55    54.49     5.43   -40.79   135.05     3.99   -38.22   138.66   -10.71   -45.76    47.68
19     34.83   -35.66   177.50     1.88   -30.52   100.87     6.84   -44.56   248.20     6.96   -43.93   239.47    -8.38   -60.73    87.21
20     21.96   -62.84   317.68    -3.40   -25.60    52.14     2.17   -48.13   238.03     1.74   -48.45   243.14   -10.26   -46.19    36.94
21      0.26   -60.00   340.75    -0.25   -36.62    78.02    -1.54   -52.60   280.56    -1.32   -53.54   286.11    -5.02   -48.93    73.72


Table 2 Summary characteristics of percentage errors in estimates of single-years of age based on pro-rating or interpolation (females)

(2.5th and 97.5th are percentiles of the error distribution)

      Rectangular                Karup-King                 Beers 1                    Beers 2                    Sprague
Age   Median    2.5th   97.5th   Median    2.5th   97.5th   Median    2.5th   97.5th   Median    2.5th   97.5th   Median    2.5th   97.5th
0       5.24   -55.00   120.00    -4.80   -54.89   143.03    -2.98   -64.50   140.04    -5.54   -68.17   140.19    10.54   -52.52   171.40
1       6.43   -51.63   107.14    -1.39   -50.11   115.24    -1.50   -54.54   117.31    -1.64   -54.12   117.51     2.39   -50.51   121.98
2       5.74   -53.90   110.43    -2.28   -47.15   109.13    -3.27   -54.22   109.51    -2.13   -53.87   110.32    -6.12   -57.83   107.11
3      11.99   -45.80   142.78     3.52   -40.53   133.83     2.54   -49.59   126.55     3.80   -48.24   147.04    -4.85   -57.06   125.31
4      11.21   -52.30   156.83    -0.03   -40.49   135.02    -0.37   -48.10   135.34     1.10   -45.55   137.51    -7.37   -55.81   124.70
5       8.07   -41.94    91.98     1.93   -39.25    98.74     1.94   -50.73   104.57     2.38   -47.82   104.86    -5.23   -58.29    93.99
6       4.77   -44.66    98.36    -3.02   -41.15   106.23    -3.16   -47.49   107.20    -3.00   -45.52   103.03    -7.04   -51.00    99.87
7       9.79   -43.72   111.58     0.58   -40.20   112.41     0.45   -45.25   125.21     0.95   -43.39   105.69     0.49   -44.60   111.41
8      -0.72   -47.39    78.57    -5.21   -40.72    70.79    -5.18   -44.54    71.45    -6.67   -46.44    57.19    -1.85   -41.48    74.83
9       7.05   -45.33   113.07    -0.15   -35.08   106.77    -0.62   -42.11   102.59    -0.47   -43.55   105.33   -72.83  -133.58   -32.05
10     -0.90   -41.00    76.98    -5.05   -39.12    94.82    -7.26   -44.71    83.96    -6.39   -43.77    84.55     4.11   -35.42   109.72
11      4.43   -37.85    92.87    -3.94   -34.85    90.57    -5.34   -42.50    89.47    -4.34   -37.71    91.71    12.24   -26.93   128.00
12     -4.83   -53.24    71.06   -10.60   -44.51    68.97   -10.50   -55.49    75.22    -9.98   -49.91    57.56     1.60   -47.79    95.63
13     -1.95   -40.00    70.45    -6.70   -38.40    68.52    -5.50    40.32    55.96    -4.43   -36.25    53.58   -12.50   -45.61    55.92
14      0.27   -55.50    87.48    -5.21   -52.31    90.57    -5.12   -52.52    85.92    -4.75   -51.55    85.33   -39.55   -73.16    20.13
15      5.56   -32.03    80.00    -2.82   -54.05    85.89    -0.48   -39.19    88.39    -0.45   -39.15    88.66    15.01   -48.14   113.38
16     14.99   -58.23   157.67     4.00   -58.21   139.07    10.42   -62.22   178.90     5.09   -67.18   172.05     8.49   -71.40   143.48
17     19.23   -55.82   148.02     8.91   -56.51   150.10     9.35   -58.15   151.78     9.40   -55.82   153.00     4.78   -74.34   143.84
18     19.70   -52.80   179.03     2.92   -68.88   186.40     5.93   -60.54   162.52     3.87   -59.42   158.77    -4.80   -85.64   180.27
19     21.00   -57.88   163.57     8.91   -58.52   149.85     8.49   -46.05   132.31     7.75   -46.20   129.38    -2.53   -85.99   145.00
20      9.05   -50.10   108.57    10.31   -65.92   237.83     8.07   -52.24   172.49     8.02   -52.35   151.94     1.42   -88.64   244.10
21      9.39   -29.52   106.57    12.76   -63.30   277.40     8.25   -30.33   124.11     7.48   -28.39   119.77     6.83   -80.50   255.51

Impact of errors on crude prevalence estimates

The effects of demographic estimation errors (Table 3)
are relatively small at the lower end of the prevalence
spectrum (2.5th percentile), on average never accounting
for more than a difference of a few people in a crude
prevalence estimate indexed at 1,000 person-years.
Though the sexes differ in the specific ages at which the
larger errors are observed, the differences were broadly
similar for male and female estimates. For males, median
differences ranged from a high of -10 persons per 1,000
person-years to a low of effectively 0. Similarly, among
females the highest observed median error was 14 persons
and the lowest was also effectively 0. The observed error
distributions in both sets were asymmetrical, with a very
large amount of variability in the range of effects observed.
This is due both to high variability and to the presence of
notable outliers in both sets. Among male single-year-of-age
estimates, the difference between the 2.5th and 97.5th
percentiles ranged from a low of 99 persons per 1,000
person-years to a high of 210 persons per 1,000
person-years. Among females, even greater variability
was observed, with differences between the 2.5th and
97.5th percentiles ranging from a low of 145 to a high
of 334.



At higher levels of simulated prevalence (97.5th percentile),
both the median differences and the range of values
between the 2.5th and 97.5th percentiles were observably
larger (Table 4). While this may be accounted for
by the difference in the frequency of events (the 2.5th
percentile of the simulated prevalence distribution is
12.57% and the 97.5th is 22.47%, a difference of nearly 10
percentage points, or roughly 10 persons per 100
person-years), the observations are striking. Among males,
median errors range from a low of effectively zero to a high
of -18 persons per 1,000 person-years. Similarly, among
females median errors range from a low of effectively 0 to
a high of 25 persons per 1,000 person-years. In both cases,
the range of differences per 1,000 person-years is nearly
double that observed among the lower prevalence-based
estimates. Among males, the differences between the 2.5th
and 97.5th percentiles range from a low of 178 persons per
1,000 person-years to a high of 378 persons per 1,000
person-years. Among females the differences between the
2.5th and 97.5th percentiles are even larger, ranging from
a low of 215 persons per 1,000 person-years to a high of
601 persons per 1,000 person-years.
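
The near-doubling is what simple arithmetic would predict. As a sketch, under the assumption (not stated explicitly above) that the simulated event count for a tract equals the prevalence $p$ multiplied by the census count $N$, and writing $\hat{N}$ for the demographic estimate, the difference per 1,000 is proportional to $p$:

```latex
\[
D(p) \;=\; 1000\left(\frac{pN}{\hat{N}} - \frac{pN}{N}\right)
     \;=\; 1000\,p\left(\frac{N}{\hat{N}} - 1\right),
\qquad
\frac{D(0.2247)}{D(0.1257)} \;=\; \frac{0.2247}{0.1257} \;\approx\; 1.79 .
\]
```

Under this assumption, every difference at the higher prevalence is about 1.79 times its counterpart at the lower prevalence, whatever the size of the demographic error in a given tract.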

Table 3 Percent-point impact of estimation errors, 2.5th percentile of simulated prevalence

Values are differences per 1,000; pctl = percentile. A question mark marks a digit that is illegible in the source.

Male age  Median      2.5th pctl  97.5th pctl Female age  Median      2.5th pctl  97.5th pctl
0         -6          -58         92          0           0           -55         89
1         -8          -67         75          1           5           -75         219
2         -2          -60         53          2           2           -57         147
3         -6          -64         61          3           3           -55         145
4         2           -50         61          4           -?          -74         116
5         -5          -67         73          5           -1          -72         105
6         4           -56         82          6           -3          -54         114
7         -1          -67         85          7           4           -63         104
8         1           -62         98          8           -1          -55         91
9         -2          -65         86          9           9           -50         108
10        0           -57         58          10          0           -54         91
11        -5          -66         91          11          8           -50         96
12        0           -63         74          12          5           -50         75
13        -5          -66         108         13          14          -50         124
14        -3          -68         142         14          5           -48         71
15        -10         -67         135         15          5           -57         134
16        1           -49         55          16          1           -59         80
17        -5          -59         71          17          -7          -79         255
18        1           -48         64          18          -11         -75         158
19        -2          -63         69          19          -?          -78         183
20        4           -44         54          20          -9          -40         107
21        0           -56         89          21          -9          -77         137

Table 4 Percent-point impact of estimation errors, 97.5th percentile of simulated prevalence

Values are differences per 1,000; pctl = percentile. A question mark or [illegible] marks a digit or value that is illegible in the source.

Male age  Median      2.5th pctl  97.5th pctl Female age  Median      2.5th pctl  97.5th pctl
0         -11         -104        165         0           0           -101        151
1         -15         -118        135         1           12          -137        395
2         -3          -109        96          2           4           -121        265
3         -11         -115        110         3           5           -118        252
4         3           -90         110         4           -8          -134        209
5         -8          -121        131         5           -2          -130        189
6         7           -101        148         6           -5          -115        206
7         -1          -120        153         7           7           -114        188
8         2           -111        176         8           -2          -118        163
9         -4          -118        155         9           16          -90         19?
10        0           -103        122         10          1           -116        154
11        -10         -120        163         11          15          -107        173
12        0           -114        134         12          10          -107        136
13        -9          -120        194         13          25          -91         224
14        [illegible] [illegible] [illegible] 14          [illegible] [illegible] [illegible]
15        -18         -120        243         15          11          -103        240
16        2           -88         99          16          1           -106        144
17        -9          -107        129         17          -13         -142        459
18        3           -86         116         18          -19         -136        284
19        -4          -113        125         19          -8          -141        329
20        8           -80         98          20          -16         -127        193
21        0           -101        161         21          -17         -139        247

Discussion 

To our knowledge, this paper represents the first published
documentation of the magnitude or distribution of
anticipated errors in small-area demographic estimates by
single years of age or their effects upon epidemiologic
statistics. The observed magnitude of errors is large; in
fact, in most cases the differences are large enough that
it would be difficult to rule out average differences in risk
between groups, since their distributions are so likely to
overlap. The range of observed errors is clearly problematic
for making public health decisions. While it is obvious
that risk scaled to 1,000 person-years would garner
substantial attention, even rescaling these statistics to 100
person-years (arguably more appropriate for small-area work)
does not solve this issue. For example, even if rescaled to
100 person-years, a difference as large as 92/1,000
person-years would suggest a difference in risk of 9.2
persons/100 person-years. This is almost certain to trigger
action by public health officials. In this respect, these
results are unsettling because they suggest that errors in
demographic estimates are likely to have frequent and
important impacts on how we utilize epidemiologic statistics
for small areas. In this study, we simulated prevalence for a
common condition (childhood obesity), but even after
capturing a reasonable range of variation in event
occurrence, the impact of demographic estimation errors was
large enough to be of considerable concern.

It may be of some comfort to imagine that in the case
of rarer events (such as childhood cancer, estimated to
impact perhaps 1/10,000 children), the accuracy of
demographic estimates should have little impact on public
health decision-making. In this circumstance, even a single
case of cancer should be noteworthy and a clustering of
events should be identifiable irrespective of estimates of
population at risk. The results presented here, however,
should caution epidemiologists and public health officials
about the potential uncertainties introduced by the use of
demographic estimates for population at risk, though it is
worth noting that using the previous decennial census
counts has been shown to introduce even greater error
than using postcensal estimates [8-12].

This study has assumed that epidemiologic events are
captured completely. In reality, estimates of census
tract-level events depend upon the process of geocoding,
by which events are placed on electronic maps and
then re-aggregated to summarize them at the tract level
[41,53-55]. Previous studies have suggested that geocoding
rates can vary from lows of 40% or less to highs
approaching 90 to 95% [56-58]. These results vary across
rural/urban strata, and it is known that incomplete
geocoding is systematic, spatially dependent, and can bias
estimates of important demographic characteristics such as
race and ethnicity [42-44,58]. Haining [45] has pointed
out that such incomplete geocoding is non-ignorable in the
statistical sense [46], and a large number of studies have
attempted to fill in spatially dependent gaps in coverage
through a variety of methods [6,47]. At least one study [7]
has attempted to quantify the magnitude of errors
introduced into small-area population estimates by
incomplete geocoding. These authors suggested average errors
attributable to geocoding to be approximately 9.0%, but
also observed that approximately 10% of errors in total
population estimates exceeded 20% and a surprising
share (nearly 4%) actually exceeded 50% error. To
date, no study has estimated the impact of incomplete
geocoding on estimates of age/sex structure or on
single-year-of-age estimates, but we can expect that when
postcensal estimates are used during periods between
censuses, rather important errors can be anticipated in
both the numerator (geocoding of events) and the denominator
(based on geocoded demographic indicators).
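
To make the joint effect of numerator and denominator errors concrete, the following Python sketch computes the prevalence an analyst would report when only a share of events is geocoded and the population estimate is off by a relative error. The function name and the input values are hypothetical illustrations, not results from this study or from the studies cited above.

```python
def apparent_rate_per_1000(true_events, true_pop, geocoding_rate, denominator_error):
    """Prevalence per 1,000 as it would appear to an analyst.

    geocoding_rate: share of events successfully geocoded (0 to 1).
    denominator_error: relative error in the population estimate
        (0.09 means the estimate is 9% too high).
    Both parameters are illustrative assumptions for this sketch.
    """
    counted_events = true_events * geocoding_rate           # numerator error
    estimated_pop = true_pop * (1.0 + denominator_error)    # denominator error
    return 1000.0 * counted_events / estimated_pop


# Hypothetical tract: 3 true events among 20 children of a given age and sex.
true_rate = apparent_rate_per_1000(3, 20, geocoding_rate=1.0, denominator_error=0.0)
biased_rate = apparent_rate_per_1000(3, 20, geocoding_rate=0.90, denominator_error=0.09)
print(true_rate, biased_rate)
```

In this example the two errors push in the same direction, so the apparent prevalence falls below the true value by more than either error would produce alone; with an underestimated denominator they would partly offset.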

Temporal drift in demographic estimation accuracy
should also be considered. It is largely unknown how the
accuracy of demographic estimates may drift over time
between censuses [8], but it is clear that it does decay as
the period being estimated gets further from
the previous census [8]. In this study, single-year-of-age
estimates were made by breaking out actual 2010 census
counts in five-year age/sex-specific intervals. While it is
debatable that census counts represent any sort of "gold
standard" [36,39,40], it is also highly plausible that they are
closer to reality than any demographic or survey-based
estimate can ever be at the point in time at which the
enumeration takes place. In practice, demographic
estimates of population at risk for five-year age/sex intervals
will display their own errors, which will in turn propagate
into those made for single years of age. It is beyond the
scope of this study to examine this drift and, in fact, any
study aiming to do so is faced with the challenge that no
estimates even approaching a gold standard exist for years
between censuses. Ex post facto evaluations [8,9,41]
suggest that errors in five-year age categories can be as high
as 80% at the census tract level, and it is unknown whether
these errors may offset one another when applied to
single-year-of-age categories. For epidemiologists seeking to
use demographic estimates of population at risk, postcensal
drift in accuracy is a real, if immeasurable, possibility.

In spite of the potential limitations highlighted in this
study, it is worth considering that alternatives may do no
better and may actually be worse than using demographic
estimates to capture population at risk. Previous studies
have indicated that using the previous census values, for
example, can produce errors that are even larger in
magnitude than those observed in demographic estimates [7,8].
Not updating estimates of population at risk from the
previous census is generally not advisable either and
introduces an additional liability associated with not capturing
changes that are important to understanding the population
dynamics that ultimately produce epidemiologic risk.
In terms of single-year-of-age estimates of population at
risk (such as for a typical census tract of about 1,500
persons), it is likely true that the number of persons within
a specific age/sex interval will be small enough that even
the errors observed here will have little effect on
estimates of prevalence. On balance, we would argue that
updating is preferred over use of the previous census.
Furthermore, previous studies indicate that simple trend
extrapolations (in which historical trends are carried
forward) are about as inaccurate as estimates produced
using other methods [7-9], again recommending the use
of demographic estimates for population at risk in
epidemiologic statistics.

It is likely that readers of this paper will be surprised
by the magnitude of error and its variability observed in
this research. It is clear that errors in demographic
estimates may introduce important limitations in small-area
epidemiologic statistics, and this challenge has not
received enough consideration in the literature. This paper
should serve to spur interest in further evaluative studies
and to motivate applied demographers to resume exploration
of novel methods in small-area demographic estimation in
search of more accurate alternatives [7-9,59]. Both
descriptive and analytic epidemiology depend not only upon
accurate estimates of risk but also upon accounting for
potential bias or uncertainty in these estimates [49,50].
From this perspective, this paper suggests that a much more
detailed consideration of how error is propagated into
small-area epidemiologic statistics is in order. Such an
analysis must include an assessment of errors,
uncertainties, and bias in both geocoding (numerator) and
demographic estimates (denominator), and this paper suggests
some potentially useful ways to approach this challenge.